130 research outputs found

    Efficient pebbling for list traversal synopses

    We show how to support efficient back traversal in a unidirectional list, using small memory and with essentially no slowdown in forward steps. Using $O(\log n)$ memory for a list of size $n$, the $i$'th back-step from the farthest point reached so far takes $O(\log i)$ time in the worst case, while the overhead per forward step is at most $\epsilon$ for an arbitrarily small constant $\epsilon>0$. An arbitrary sequence of forward and back steps is allowed. A full trade-off between memory usage and time per back-step is presented: $k$ vs. $kn^{1/k}$ and vice versa. Our algorithms are based on a novel pebbling technique which moves pebbles on a virtual binary, or $t$-ary, tree that can only be traversed in a pre-order fashion. The compact data structures used by the pebbling algorithms, called list traversal synopses, extend to general directed graphs, and have other interesting applications, including memory-efficient hash-chain implementation. Perhaps the most surprising application is in showing that for any program, arbitrary rollback steps can be efficiently supported with small overhead in memory, and marginal overhead in its ordinary execution. More concretely: let $P$ be a program that runs for at most $T$ steps, using memory of size $M$. Then, at the cost of recording the input used by the program, and increasing the memory by a factor of $O(\log T)$ to $O(M \log T)$, the program $P$ can be extended to support an arbitrary sequence of forward execution and rollback steps: the $i$'th rollback step takes $O(\log i)$ time in the worst case, while forward steps take $O(1)$ time in the worst case, and $1+\epsilon$ amortized time per step. Comment: 27 page
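    To make the memory-versus-back-step trade-off concrete, here is a minimal sketch of a much simpler scheme than the paper's pebbling technique: a cursor that keeps a checkpoint every `stride` nodes and replays forward from the nearest checkpoint on a back step. The names `Node`, `build_list`, `CheckpointCursor` and `stride` are hypothetical, and this fixed-stride scheme does not achieve the $O(\log n)$ memory and $O(\log i)$ back-step bounds stated above.

```python
class Node:
    """Node of a singly linked (unidirectional) list."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def build_list(values):
    """Build a singly linked list from a Python sequence and return its head."""
    head = None
    for v in reversed(values):
        head = Node(v, head)
    return head


class CheckpointCursor:
    """Cursor supporting back steps by storing one checkpoint every `stride` nodes.

    Memory is about n / stride pointers; a back step follows at most
    stride - 1 forward links from the nearest earlier checkpoint.
    """
    def __init__(self, head, stride):
        self.stride = stride
        self.pos = 0
        self.node = head
        self.checkpoints = [head]  # checkpoints[j] is the node at index j * stride

    def forward(self):
        if self.node.next is None:
            raise IndexError("already at the end of the list")
        self.node = self.node.next
        self.pos += 1
        # Record a checkpoint only the first time this index is reached.
        if self.pos % self.stride == 0 and self.pos // self.stride == len(self.checkpoints):
            self.checkpoints.append(self.node)
        return self.node.value

    def back(self):
        if self.pos == 0:
            raise IndexError("already at the head of the list")
        target = self.pos - 1
        node = self.checkpoints[target // self.stride]
        for _ in range(target % self.stride):
            node = node.next
        self.node, self.pos = node, target
        return node.value
```

    Choosing `stride` near $\sqrt{n}$ balances the two costs, which corresponds roughly to one point on the kind of trade-off curve the abstract describes.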

    The streaming $k$-mismatch problem

    We consider the streaming complexity of a fundamental task in approximate pattern matching: the $k$-mismatch problem. It asks to compute Hamming distances between a pattern of length $n$ and all length-$n$ substrings of a text for which the Hamming distance does not exceed a given threshold $k$. In our problem formulation, we report not only the Hamming distance but also, on demand, the full \emph{mismatch information}, that is, the list of mismatched pairs of symbols and their indices. The twin challenges of streaming pattern matching derive from the need both to achieve small working space and also to guarantee that every arriving input symbol is processed quickly. We present a streaming algorithm for the $k$-mismatch problem which uses $O(k\log{n}\log\frac{n}{k})$ bits of space and spends \ourcomplexity time on each symbol of the input stream, which consists of the pattern followed by the text. The running time almost matches the classic offline solution and the space usage is within a logarithmic factor of optimal. Our new algorithm therefore effectively resolves and also extends an open problem first posed in FOCS'09. En route to this solution, we also give a deterministic $O(k(\log\frac{n}{k} + \log|\Sigma|))$-bit encoding of all the alignments with Hamming distance at most $k$ of a length-$n$ pattern within a text of length $O(n)$. This secondary result provides an optimal solution to a natural communication complexity problem which may be of independent interest. Comment: 27 page
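    The problem statement above can be pinned down with a brute-force offline baseline. This is only a sketch of what the output looks like, not the streaming algorithm of the paper: it stores the whole text and takes $O(nk)$ time per alignment in the worst case. The function name `k_mismatch` and the tuple layout are hypothetical.

```python
def k_mismatch(pattern, text, k):
    """Report every alignment whose Hamming distance to `pattern` is at most k.

    Returns a list of (start, distance, mismatches), where mismatches is a
    list of (index_in_pattern, pattern_symbol, text_symbol) triples.
    """
    n = len(pattern)
    results = []
    for start in range(len(text) - n + 1):
        mismatches = []
        for j in range(n):
            if pattern[j] != text[start + j]:
                mismatches.append((j, pattern[j], text[start + j]))
                if len(mismatches) > k:
                    break  # threshold exceeded, this alignment is not reported
        if len(mismatches) <= k:
            results.append((start, len(mismatches), mismatches))
    return results
```

    For example, `k_mismatch("abc", "abcabdxbc", 1)` reports alignments 0, 3 and 6, each with its mismatch information.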

    Sublinear Distance Labeling

    A Simple Algorithm for Approximating the Text-To-Pattern Hamming Distance

    Computing the Hamming distance between a given pattern of length $m$ and each location in a text of length $n$, both over a general alphabet $\Sigma$, is one of the most fundamental tasks in string algorithms. The fastest known runtime for exact computation is $\tilde{O}(n\sqrt{m})$. We recently introduced a complicated randomized algorithm for obtaining a $(1 \pm \epsilon)$ approximation for each location in the text in $O(\frac{n}{\epsilon} \log\frac{1}{\epsilon} \log n \log m \log|\Sigma|)$ total time, breaking a barrier that stood for 22 years. In this paper, we introduce an elementary and simple randomized algorithm that takes $O(\frac{n}{\epsilon} \log n \log m)$ time.
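    As an illustration of the quantity being approximated, the sketch below computes the exact text-to-pattern Hamming distance at one location and a naive estimate from uniformly sampled positions. This plain sampling estimator is an assumption made for illustration; it is not the algorithm of the paper and comes with no $(1 \pm \epsilon)$ guarantee for small distances. The names `exact_hamming` and `sampled_hamming` are hypothetical.

```python
import random


def exact_hamming(pattern, text, start):
    """Exact Hamming distance between pattern and text[start:start+len(pattern)]."""
    return sum(1 for j, c in enumerate(pattern) if c != text[start + j])


def sampled_hamming(pattern, text, start, samples, rng):
    """Estimate the Hamming distance from `samples` uniformly random positions.

    The fraction of sampled positions that mismatch is scaled by the
    pattern length m to give an unbiased estimate of the distance.
    """
    m = len(pattern)
    hits = 0
    for _ in range(samples):
        j = rng.randrange(m)
        if pattern[j] != text[start + j]:
            hits += 1
    return m * hits / samples
```

    A call such as `sampled_hamming(p, t, 0, 200, random.Random(0))` trades accuracy for time: more samples reduce the variance of the estimate.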

    Explicit Non-Adaptive Combinatorial Group Testing Schemes

    Group testing is a long-studied problem in combinatorics: a small set of $r$ ill people should be identified out of the whole population ($n$ people) by using only queries (tests) of the form "Does set X contain an ill human?". In this paper we provide an explicit construction of a testing scheme which is better (smaller) than any known explicit construction. This scheme has $\Theta(\min[r^2 \ln n, n])$ tests, which is as many as the best non-explicit schemes have. In our construction we use a fact that may be of value in its own right: linear error-correction codes with parameters $[m,k,\delta m]_q$ meeting the Gilbert-Varshamov bound may be constructed quite efficiently, in $\Theta(q^k m)$ time. Comment: 15 pages, accepted to ICALP 200
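    The non-adaptive model described above can be sketched as follows: all tests are fixed in advance, and decoding removes every person who appears in a negative test. The test design used here (`bit_tests`, one pair of tests per bit of a person's index) is a hypothetical toy that identifies the ill person exactly only when $r = 1$; it is not the explicit construction of the paper.

```python
def bit_tests(n):
    """Toy non-adaptive design: for each bit b, one test for people whose
    index has bit b set and one for people whose index has it cleared."""
    bits = max(1, (n - 1).bit_length())
    tests = []
    for b in range(bits):
        tests.append({p for p in range(n) if (p >> b) & 1})
        tests.append({p for p in range(n) if not (p >> b) & 1})
    return tests


def run_tests(tests, ill):
    """Answer each query 'does this set contain an ill person?'."""
    return [bool(t & ill) for t in tests]


def decode(n, tests, outcomes):
    """Keep everyone who never appears in a negative test.

    The result always contains the true ill set; it equals it when the
    design is strong enough for the number of ill people.
    """
    candidates = set(range(n))
    for t, positive in zip(tests, outcomes):
        if not positive:
            candidates -= t
    return candidates
```

    With `n = 8` and a single ill person, the six tests from `bit_tests(8)` suffice to recover that person; with more ill people the decoder returns a superset, which is why stronger designs with about $r^2 \ln n$ tests are needed.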